-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
security: replace vulnerable regex with parser #1223
Conversation
Problem: link regex was vulnerable Solution: dedicated parser Fixes: markedjs#1222
If this is merged, a new CI on #1220 should come back clean. |
I feel like this complicates the code a lot. Is there no way we can do this with a constructed regex? |
I spent about 30 minutes tinkering with regex variations but couldn't find one. The difficulty is that the "destination" can contain characters overlapping with the "title". The relatively straightforward regex I replaced captured the specification (reasonably) appropriately, though it actually cheated in a few aspects (e.g. that a title can be opened by |
lib/marked.js
Outdated
if (m = destinationRe.exec(destination)) { | ||
// <destination> -> destination | ||
var dest2 = m[1].trim(); | ||
destination = unwrapCarats(dest2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
destination
is assigned but never used. Is this intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope. Will fix.
lib/marked.js
Outdated
.replace('label', inline._label) | ||
.getRegex(); | ||
|
||
function unwrapCarats (str) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Carets?? Carat and carrot are different. :)
Angle brackets might be most appropriate if I'm reading the regex correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha I was wondering if I was spelling that right. I'll switch to AngleBrackets anyway.
lib/marked.js
Outdated
.replace('label', inline._label) | ||
.getRegex(), | ||
link: { | ||
exec: function (s) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth adding some doc blocks to introduce the why behind some of this...nothing too major, just to help those new to the code.
@styfle @joshbruce fbf93a8 addresses your comments. |
lib/marked.js
Outdated
} | ||
|
||
var destinationRe = /^(<?[\s\S]*>?)/; | ||
if (m = destinationRe.exec(destination)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The two if blocks
are nearly identical. Can you make a common function for those? Something like this:
function getMatch(r, fullMatch) {
var m = r.exec(fullMatch[2]);
if (m) {
var dest = unwrapAngleBrackets(m[1].trim());
var title = m[2];
return [fullMatch[0], fullMatch[1], dest, title];
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, but not a deal breaker for my review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lib/marked.js
Outdated
} | ||
} | ||
|
||
if (match) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you flip this so that the if statement is smaller:
if (!match) {
return null;
}
// ... get dest and title here
return [dest, title];
lib/marked.js
Outdated
} | ||
|
||
var fullMatch = generalLinkRe.exec(s); | ||
if (fullMatch) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you flip this so that the if statement is smaller:
if (!fullMatch) {
return null;
}
// ... split and such here
return [fullMatch[0], text, destinationAndTitle[0], destinationAndTitle[1]];
lib/marked.js
Outdated
match = parsingRegexes[i].exec(destination); | ||
if (match) { | ||
dest = match[1]; | ||
title = match[2]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need to assign dest and title here. Simply use match
below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
@UziTech Can you take a look at the code now? Any other reservations? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason marked is so fast is because of the speed of regexes. I just don't want this type of thing to become the norm. There is no reason to have another slow markdown parser.
@UziTech Good point. Why is CI not running on this PR? |
Not sure. That seems to be happening a lot lately. Maybe a travis bug? |
@UziTech: Good point. I think parsers like this should be done as a last resort unless we can demonstrate through benchmarking that there isn't a big performance difference...I don't think we should get in the position of asserting something is more performant in all cases. The CI thing is weird. Is it possible that the key thing is causing issues? |
Looks like this failed lint: https://travis-ci.org/markedjs/marked/builds/367190715 @davisjam Can you try running lint and tests on your machine and fix any issues? |
Lint passes now, that CI was on a previous version. |
* security: replace vulnerable regex with parser Problem: link regex was vulnerable Solution: dedicated parser Fixes: markedjs#1222
Problem: link regex was vulnerable
Solution: dedicated parser
Fixes: #1222